For preprocessing unstructured optics data into structured form, this article proposes a new context-based algorithm for automatic Chinese word segmentation of full text: it combines context-dependent word frequencies with a keyword dictionary to segment Chinese text efficiently. It also proposes a new automatic method for classifying optics literature by specialty: each optics keyword is tagged with an optics subject-category attribute, and the proportions of keywords belonging to each subject category in an article are computed to classify the article quantitatively.
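
The classification idea can be illustrated with a minimal Python sketch. This is not the article's implementation. The dictionary entries, subject names, and function names are hypothetical. Keyword spotting here uses a plain greedy longest-match scan against the dictionary, which stands in for the context-based word-frequency segmentation that the abstract describes but does not detail.

```python
from collections import Counter

# Hypothetical keyword dictionary: optics keyword -> subject category.
# These entries are illustrative assumptions, not taken from the article.
KEYWORD_SUBJECT = {
    "激光": "laser physics",
    "激光器": "laser physics",
    "光纖": "fiber optics",
    "光纖通信": "fiber optics",
    "全息": "holography",
}


def find_keywords(text, dictionary):
    """Return dictionary keywords found in text, scanning left to right.

    Greedy longest match: at each position the longest dictionary entry
    wins. This is a stand-in for the article's context-based segmentation.
    """
    max_len = max(len(k) for k in dictionary)
    found = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                found.append(candidate)
                i += size
                break
        else:
            i += 1  # no keyword starts here; move on one character
    return found


def subject_proportions(text, dictionary):
    """Share of keyword occurrences that belongs to each subject category."""
    counts = Counter(dictionary[k] for k in find_keywords(text, dictionary))
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def classify(text, dictionary):
    """Pick the subject with the largest keyword share, or None if no keyword."""
    props = subject_proportions(text, dictionary)
    return max(props, key=props.get) if props else None


sample = "光纖通信系統使用激光器，光纖損耗很低"
print(find_keywords(sample, KEYWORD_SUBJECT))
print(subject_proportions(sample, KEYWORD_SUBJECT))
print(classify(sample, KEYWORD_SUBJECT))
```

In this sketch every keyword occurrence counts equally. Weighting keywords, or reporting the full proportion table instead of a single label, would be natural variations, but the abstract does not say which the article uses.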